ABSTRACT 

DNA transposases facilitate genome rearrange- 
ments by moving DNA transposons around and 
between genomes by a cut-and-paste mechanism. 
DNA transposition proceeds in an ordered series of 
nucleoprotein complexes that coordinate pairing 
and cleavage of the transposon ends and integra- 
tion of the cleaved ends at a new genomic site. 
Transposition is initiated by transposase recogni- 
tion of the inverted repeat sequences marking 
each transposon end. Using a combination of solu- 
tion scattering and biochemical techniques, we 
have determined the solution conformations and 
stoichiometries of DNA-free Mos1 transposase and 
of the transposase bound to a single transposon 
end. We show that Mos1 transposase is an 
elongated homodimer in the absence of DNA and 
that the N-terminal 55 residues, containing the first 
helix-turn-helix motif, are required for dimerization. 
This arrangement is remarkably different from the 
compact, crossed architecture of the dimer in the 
Mos1 paired-end complex (PEC). The transposase 
remains elongated when bound to a single- 
transposon end in a pre-cleavage complex, and 
the DNA is bound predominantly to one transposase 
monomer. We propose that a conformational 
change in the single-end complex, involving rotation 
of one half of the transposase along with binding of 
a second transposon end, could facilitate PEC 
assembly. 



INTRODUCTION 

Transposable elements are significant components of most 
eukaryotic genomes, shaping genome architecture and 
gene regulatory networks (1,2). By moving from one 



location to another, they can alter gene expression, 
create mutations or promote genome evolution by 
creating novel genes (3,4). DNA transposons of the 
mariner/Tc1 family are particularly widespread in nature 
(5) and are active in a broad range of species, including 
vertebrates. Because of this, they are being exploited suc- 
cessfully as tools for genetic engineering and gene delivery 
(6,7). 

Mariner/Tc1 elements move via a DNA intermediate, in 
a cut-and-paste mechanism that brings the two ends of the 
transposon together (Figure 1). The ends are separated by 
1-2 kb and marked by inverted repeat (IR) DNA se- 
quences. Transposition is orchestrated by a transposase 
enzyme (encoded by the transposon sequence itself) 
within an ordered series of nucleoprotein complexes. 
Initially the specific sequence of the IR DNA is recognized 
by the transposase DNA-binding domain. The two trans- 
poson ends are then brought together, in a paired-end 
complex (PEC), and precisely cleaved at the IRs before 
being inserted at a new genomic site. DNA transposases 
act as oligomers to bring the two transposon ends 
together. This provides at least two catalytic sites for the 
excision reactions at each end as well as the insertion re- 
actions to covalently join the cleaved transposon to a new 
target site. Various strategies for looping of DNA to 
assemble the paired-end transposition complex are 
possible (8). A pre-formed transposase oligomer can 
bind to one transposon end before recruiting the second 
naked end, as occurs during P-element transposition (9), 
or the transposase can oligomerize after initially binding 
separately to each IR or to just one IR. 

The mechanism of transposition of the active mariner/Tc1 
transposon Mos1 (from Drosophila mauritiana) has 
been studied from both biochemical and structural per- 
spectives (10-14). The order and requirements for DNA 
excision have been established (10,11), and crystal struc- 
tures have been determined of the transposase C-terminal 
catalytic domain (11) and the paired-end complex of 
transposase and cleaved transposon DNA (14). The PEC 
structure, representing the Mos1 transposition machinery 

Figure 1. Schematic of Mos1 transposition. The 1.3 kb transposon has 
28 bp inverted repeats (IR) indicated with arrows. Binding of a 
transposase dimer to one end forms a single-end complex (SEC2). 
The transposon is fully excised from donor DNA in the paired-end 
complex (PEC). Mariner/Tc1 transposons integrate at TA sequences, 
resulting in signature duplications either side of the inserted 
transposon. 

after DNA cleavage and before transposon insertion into 
target DNA, contains a transposase dimer bound to two 
cleaved transposon ends in a crossed arrangement: each 
28 bp IR sequence is recognized by the N-terminal DNA 
binding domain of one transposase monomer and by the 
C-terminal catalytic domain of the other monomer. The 
bipartite DNA-binding domain contains two helix-turn- 
helix (HTH) motifs. The N-terminal 55 residues (contain- 
ing HTH1) also form part of the transposase dimerization 
interface within the PEC (14). 

The molecular events occurring earlier in the Mos1 
transposition pathway, leading to PEC formation, 
remain poorly understood. Mos1 transposase forms a 
dimer in solution without DNA (11), and it has been 
proposed that initially a transposase dimer binds rapidly 



to one transposon end, in a single-end complex (SEC2) 
(13,15); the N-terminal 35 residues were implicated in 
cis-Mos1 transposase dimerization, leading to binding of 
a single IR (12). The PEC could then form by recruitment 
of the other naked IR to the complex. Divalent Mg2+ or 
Mn2+ ions are required for PEC formation, as well as 
DNA cleavage of mariner transposons (10,13,14,16,17). 
The metal ions are coordinated by a triad of aspartic 
acid residues in the RNase-H-like catalytic core of the 
C-terminal catalytic domain (11). Metal ions may play 
an additional role in stabilizing protein-DNA interactions 
in the PEC and/or promoting the correct conformation of 
this complex (15). It has been suggested that nicking of 
one DNA strand can occur before pairing of the trans- 
poson ends (10,16) and that this cleavage event may 
promote a conformational change that facilitates PEC 
assembly. However, this proposal is controversial; the 
paradigm, established for bacterial DNA transposases, is 
for pairing of transposon ends before DNA catalysis. 
More recently Carpentier et al. (18) argued that PEC 
assembly is indeed required for first strand cleavage of 
Mos1 inverted repeats. 

We sought to establish the architecture of the full-length 
Mos1 transposase prior to DNA binding, and the 
conformation of the pre-cleavage SEC2. Attempts to 
crystallize the full-length transposase without DNA have 
not been successful. However, solution scattering methods 
can provide structural parameters and low resolution con- 
formations of proteins and complexes. Neutron scattering 
contrast variation in particular can distinguish the con- 
formations and relative spatial arrangements of the con- 
stituents of a nucleoprotein complex. This technique 
exploits the significantly different scattering length 
densities of protein molecules compared with DNA, and 
of hydrogen compared with deuterium (19). The contrast 
between the solute and the solvent can be modulated by 
altering the solvent D2O:H2O ratio. The contrast is 
matched, at a particular D2O:H2O ratio, when the 
scatter from a solute molecule equals that of the solvent; 
the scattering from solute is therefore eliminated when the 
scattering from the solvent is subtracted. Thus, it is 
possible to mask the protein component of a complex, 
so that only the DNA component contributes to the 
overall scattering, and vice versa. Incorporation of deuter- 
ium-labelled proteins in a complex extends the range of 
contrasts available beyond those originally developed by 
Stuhrmann (20) and reduces the incoherent background 
scatter from hydrogen in the sample. Dedicated facilities 
for the production of deuterated macromolecules have 
been developed within ILL's Life Science Group (the 
D-LAB), and there are now numerous examples 
illustrating the power of deuteration approaches in small 
angle neutron scattering (SANS) (21-24), neutron crystal- 
lography (25) and dynamics (26). 

Here we have combined complementary methods to 
obtain low resolution solution structures of two early 
molecular intermediates in Mos1 transposition: Mos1 
transposase and its pre-cleavage, single-end complex with 
transposon DNA. We show that the DNA-free transpo- 
sase homodimer adopts an extended conformation in 
solution, in contrast to the more compact arrangement 
of the crossed dimer in the PEC crystal structure. The first 
55 residues of the transposase DNA-binding domain are 
essential for dimerization. The elongated conformation is 
slightly extended in the SEC2 formed under non-catalytic 
conditions, and the DNA is bound predominantly to one 
half of the transposase dimer. We propose that rotation of 
one transposase monomer about the DNA-binding 
domain, coupled to binding of a second transposon 
DNA end, could provide a mechanism to form the 
compact PEC architecture previously determined by 
X-ray crystallography. 

MATERIALS AND METHODS 

Expression and purification of hydrogenated and 
deuterated Mos1 transposases 

Mos1 transposase (containing the solubilizing mutation 
T216A), hereafter referred to as H-Mos1, was expressed 
and purified as previously described (27). 
Partially deuterated Mos1 transposase (D-Mos1) was expressed 
in Escherichia coli BL21(DE3) cells in minimal 
medium (28) containing 85% (v/v) recycled D2O. Medium 
was inoculated with hydrogenated cell culture to 1% (v/v) 
and incubated at 37°C for 2-4 days. Protein expression 
was induced at 30°C with isopropylthiogalactoside 
(1 mM), and the cells were harvested after 5 h. D-Mos1 
purification was performed as described previously (27) 
except that cation exchange was performed on an 
SP-Sepharose FF column. The yield of D-Mos1 was 
0.75 mg per litre of culture. Proteins were stored at 
-20°C in 25 mM Tris pH 7.5, 250 mM NaCl, 1 mM 
DTT and 50% (v/v) glycerol. 

Mass spectrometry of deuterated Mos1 transposase 

The mass of D-Mos1 was measured as 42166 Da by 
MALDI mass spectrometry. Thus, the level of deuterium 
incorporation was 52.2% in H2O and 77.7% in D2O 
(assuming exchange of all labile hydrogens and the loss 
of the first methionine in the sequence of the peptide). 

Cloning of the N-terminal truncation mutant 
delta-55 Mos1 

The sequence coding for Mos1 transposase lacking the 
N-terminal 55 amino acids was amplified from the full-length 
Mos1 gene using forward (5'-TACGT CATATG GGCGATTTTGATGTGGATG-3') 
and reverse (5'-TACGG CTCGAG TTA TTCAAAGTACTTGCC-3') primers, where 
the restriction sites (NdeI and XhoI) and the stop codon (TTA on the 
reverse primer) are set off by spaces. The 895 bp PCR product was 
digested with NdeI and XhoI enzymes (NEB), gel-purified and ligated 
(Rapid Ligation Kit, Roche) with similarly digested, gel-purified 
and de-phosphorylated pET30a or pET28a vector. The 
integrities of the resulting clones (pET30-D55 and pET28-D55) 
were confirmed by Sanger sequencing. 
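
The primer design can be sanity-checked in a few lines. The sketch below is illustrative only: it assumes the garbled stop codon in the reverse primer reads TTA, and it uses a deliberately partial codon table containing just the codons needed here. It checks that the forward primer places an initiating Met in front of Gly 56 and Asp 57, and that the reverse primer encodes the C-terminal residues ending in Glu followed by a stop.

```python
# Partial codon table (standard genetic code); only the codons used below.
CODONS = {"ATG": "M", "GGC": "G", "GAT": "D", "TTT": "F", "GTG": "V",
          "AAG": "K", "TAC": "Y", "GAA": "E", "TAA": "*"}


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq))


def translate(seq):
    """Translate complete codons only; trailing bases are ignored."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODONS[seq[i:i + 3]] for i in range(0, usable, 3))


forward = "TACGTCATATGGGCGATTTTGATGTGGATG"
reverse = "TACGGCTCGAGTTATTCAAAGTACTTGCC"  # stop codon read as TTA (assumption)

# The reading frame starts at the ATG inside the NdeI site (CATATG).
fwd_peptide = translate(forward[forward.index("CATATG") + 3:])

# The reverse complement of the reverse primer gives the 3' end of the
# coding strand, which runs up to the XhoI site (CTCGAG).
rev_rc = reverse_complement(reverse)
rev_peptide = translate(rev_rc[:rev_rc.index("CTCGAG")])

print(fwd_peptide, rev_peptide)
```

The partial table raises a KeyError on any other codon, which is intentional: it keeps the check from silently translating a sequence it was not written for.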

Expression and purification of deletion mutants 

The delta-55 Mos1 mutant without a tag was expressed 
from the pET30a-D55 plasmid. The 290 residue protein 
(34.34 kDa) was purified by cation exchange and 
gel-filtration chromatography, as for full-length 
transposase. The N-terminal 130 amino acids of Mos1 
were expressed as previously described (29), and purified 
by elution from an IMAC FF column (GE Healthcare) 
with imidazole, followed by cation exchange 
chromatography. The N-terminal six His tag was 
removed by thrombin cleavage (5 units/mg of purified 
protein) in PBS at 17°C for 16 h prior to gel-filtration. 
The resulting protein contained 133 amino acids (with a 
molecular mass of 15.5 kDa). 

Preparation of transposase samples for SANS 
experiments 

H-Mos1 and D-Mos1 protein samples were exchanged 
multiple times (in a Vivaspin concentrator or by dialysis, 
respectively) into fully hydrogenated and fully deuterated 
buffers containing 25 mM Tris (pH or pD 7.5), 350 mM 
KCl and 1 mM DTT. Samples of D-Mos1 in buffer 
containing 30% (v/v) D2O or 65% (v/v) D2O were then 
prepared by mixing together appropriate volumes of 
D-Mos1 in fully hydrogenated and fully deuterated buffer. 
The concentration of each sample was calculated from 
triplicate measurements of absorbance at 280 nm on a 
Shimadzu UV-2401PC UV-Visible spectrophotometer. 

DNA substrates 

The DNA duplex for SEC2 was prepared by annealing 
two complementary 50-mers containing the 28 base IR 
sequence specifically recognized by transposase, 
surrounded by 8 bases of transposon sequence and 14 
bases of flanking DNA. These had the sequences (5' ttt 
aaa aa AAA CGA CAT TTC ATA CTT GTA CAC CTG 
A tag ttt cta tat tc) and (5' gaa tat aga aac taT CAG GTG 
TAC AAG TAT GAA ATG TCG TTT ttt tta aa), where 
the IR sequence is in upper case. Oligonucleotides were 
synthesized and PAGE purified by Integrated DNA 
Technologies. Lyophilized samples were dissolved in 
either H2O or D2O and annealed by heating molar 
equivalents of complementary strands to 90°C for 10 min 
and cooling to 25°C in steps of 2°C per 30 s. 

Preparation of transposase DNA complexes 

Single-end complexes were prepared by adding either 
H-Mos1 or D-Mos1 to DNA in the molar ratio 2:1 in 
20 µl aliquots. For contrast variation experiments, four 
complexes were prepared, with different ratios of 
H2O:D2O in the buffer. Two complexes (H-SEC2) 
contained H-Mos1 and 0% (v/v) or 100% (v/v) D2O 
and two contained D-Mos1 in buffers with 65% (v/v) or 
100% (v/v) D2O (D-SEC2). Sample concentrations were 
calculated from the average of three absorbance 
measurements at 280 nm and 260 nm. 

Gel-Filtration of transposase-DNA complexes 

Samples were separated at room temperature on a 
Superdex 200 10/300GL column (GE Healthcare). Iso- 
cratic elution with buffer containing 20 mM Tris pH 7.5, 
0.3 M KCl and 1 mM DTT was monitored by absorbance 
at 260 nm and 280 nm. Eluted fractions containing SEC2 



Nucleic Acids Research, 2013, Vol. 41, No. 3 2023 



were loaded onto a 15% SDS-PAGE along with 6 internal 
standards containing 150 ng, 300 ng or 450 ng of Mos1 
transposase or DNA. The gel was silver-stained for 
protein and DNA, and scanned using a BioRad GelDoc 
EZImager (GE Healthcare). The intensity of bands was 
quantified with EZImager using a uniform box size and 
mean background subtraction. 

Size-Exclusion Chromatography Multi-Angle Laser Light 
Scattering (SEC-MALLS) 

The molar mass of transposases, duplex DNA and the SEC2 
complex was determined by SEC-MALLS. Samples were 
separated at room temperature on a Superdex 200 10/ 
300GL column linked to either an AKTA Ettan (GE 
Healthcare) or Shimadzu HPLC system. The column was 
pre-equilibrated with at least two column volumes of buffer 
(as above). Elution was performed isocratically at 0.5 ml/ 
min and monitored by absorbance at 280 nm. The HPLC 
system was connected to a DAWN HELIOS II™ MALLS 
instrument (Wyatt Technology) and Optilab T-rEX 
refractometer (Wyatt Technology). On-line measurement 
of the intensity of the Rayleigh scattering as a function of 
the angle of the eluting peaks was used to determine the 
weight average molecular masses (Mw) of the eluted 
samples, using the ASTRA™ (Wyatt Technology) 
software. 

Small angle X-ray scattering (SAXS) data collection and 
processing 

SAXS data were collected at the European Synchrotron 
Radiation Facility (ESRF) (beam line ID14-3, 
λ = 0.931 Å) on a Pilatus 1M detector. The sample to 
detector distance was 2.43 m, and scattering data were 
collected within the momentum transfer (q) range 
0.001-0.35 Å⁻¹. To avoid radiation damage, the sample 
(30 µl) was pushed through the capillary cell during data 
collection. Data were collected in multiple 30 s frames, 
inspected and averaged in PRIMUS (30), normalized to 
the incident beam intensity, and the scattering of the 
buffer was subtracted. To check for concentration-dependent 
effects, scattering data were collected from H-Mos1 
samples at 1.45 mg/ml, 2.9 mg/ml and 4.4 mg/ml. SAXS 
data were calibrated against bovine serum albumin (BSA) 
at 5.6 mg/ml. The molecular masses of the transposase and 
the complex were calculated from the extrapolated 
intensity at zero angle I(0), obtained with GNOM, by 
comparison with the I(0) of BSA (molecular mass 66 kDa). 
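
The comparison with a BSA standard amounts to scaling the concentration-normalized forward scattering. A minimal sketch follows; the function name and the example numbers are hypothetical, and it assumes the sample and the standard have similar contrast and partial specific volume, which is reasonable for two proteins but only approximate for a protein-DNA complex.

```python
def mass_from_i0(i0_sample, conc_sample, i0_std, conc_std, mass_std_kda=66.0):
    """Estimate molecular mass (kDa) from forward scattering I(0).

    I(0)/c is taken to be proportional to molecular mass, so the sample
    mass is the standard's mass scaled by the ratio of I(0)/c values.
    Concentrations must share the same units (for example mg/ml).
    """
    if min(conc_sample, conc_std, i0_std) <= 0:
        raise ValueError("concentrations and standard I(0) must be positive")
    return mass_std_kda * (i0_sample / conc_sample) / (i0_std / conc_std)


# Hypothetical inputs for illustration only (not measured values).
example_kda = mass_from_i0(i0_sample=10.0, conc_sample=2.0,
                           i0_std=16.5, conc_std=5.0)
print(round(example_kda, 1))
```

Because the estimate is linear in 1/c, any error in the concentration measurement carries straight through to the mass.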

SANS data collection and processing 

All SANS data were collected at the high neutron flux 
reactor at Institut Laue Langevin on beam line D22 
equipped with a ³He Reuter-Stokes® multi-detector. 
Samples (minimum volume 200 µl) were placed in 
1.0 mm path length quartz cuvettes, sealed and transferred 
to a sample changer maintained at 23°C. Scattering data 
were collected at sample-detector distances of 2 m and 8 m 
for 900 s and 3600 s respectively. Transmission data were 
collected at a sample-detector distance of 8 m for 180 s. In 
each case the collimation distance was equal to the 
sample-detector distance. 



SANS data were reduced in GRASansP v6.01 beta 
(www.ill.fr/lss/grasp). Data were normalized to the 
incident beam intensity and corrected for transmission, 
cuvette thickness and detector efficiency. Scattering from 
the sample holder and instrument background were then 
subtracted and the data averaged radially around the 
beam centre and normalized. Data collected at 2 m and 
8 m sample-detector distances were then scaled and 
merged in MS-EXCEL. The range of momentum 
transfer sampled was 0.007 < q < 0.35 Å⁻¹. The Rg 
and I(0) values for each solute were extrapolated from a 
Guinier plot of the very low angle scattering data in 
PRIMUS and GNOM. Molecular masses of the 
hydrogenated solutes were estimated using water 
scattering as a reference (31,32). 
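
For orientation, the Guinier extrapolation fits a straight line to ln I(q) against q² at very low angle, using ln I(q) = ln I(0) - (Rg²/3) q². The sketch below is a plain least-squares version, not the PRIMUS or GNOM implementation; the function names, the cut-off argument and the synthetic data are all assumptions made for illustration.

```python
import math


def linear_fit(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def guinier(q, intensity, q_max):
    """Return (Rg, I0) from points with q <= q_max and positive intensity."""
    pts = [(x * x, math.log(i)) for x, i in zip(q, intensity)
           if x <= q_max and i > 0]
    if len(pts) < 3:
        raise ValueError("need at least three points in the Guinier region")
    slope, intercept = linear_fit([p[0] for p in pts], [p[1] for p in pts])
    if slope >= 0:
        raise ValueError("intensity does not decay with q; no Guinier region")
    return math.sqrt(-3.0 * slope), math.exp(intercept)


# Synthetic, noise-free curve for a particle with Rg = 50 Å and I(0) = 1.
q_values = [0.007 + 0.001 * k for k in range(20)]  # Å^-1, up to 0.026
curve = [math.exp(-(x * 50.0) ** 2 / 3.0) for x in q_values]
rg, i0 = guinier(q_values, curve, q_max=0.026)
print(round(rg, 2), round(i0, 3))
```

The cut-off is chosen here so that q·Rg stays at or below about 1.3, the usual rule of thumb for globular particles; elongated particles often need a tighter limit.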

Molecular modelling of Mos1 transposase 

Low resolution protein structure models were reconstruc- 
ted by ab initio simulated annealing, representing the 
protein as a chain-like ensemble of dummy residues in 
GASBOR (33). Ten models of Mos1 transposase were 
generated in separate runs, and aligned pair-wise with 
SUPCOMB (34) to compute the normalized spatial 
discrepancy (NSD) between models. The quaternary 
structure of Mos1 transposase was modelled with 
SASREF (35) on the ATSAS server (http://www.embl- 
hamburg.de/biosaxs/atsas-online). A dimeric arrangement 
of subunits was constructed by simulated annealing using 
the atomic structure of a transposase monomer from the 
PEC crystal structure (PDB ID: 3HOT) and minimized 
against the experimental SAXS data. 

MONSA analysis 

Low resolution models of the SEC2 complex were 
calculated using MONSA (36) at the ATSAS server. 
Five transposase-DNA complex data sets were included: 
one SAXS data set of H-SEC2, and four SANS data sets 
of H-SEC2 or D-SEC2 at different H2O:D2O ratios. 
Contrasts for each component of the complex were 
based on the isotopic composition of the sample for a 
given D2O buffer composition (Supplementary Table S1). 



RESULTS 

H-Mos1 transposase structural parameters 

SAXS data were collected from a solution of hydrogenated 
transposase (H-Mos1) at 1.8 mg/ml (Figure 2A), and 
SANS data were collected from solutions of H-Mos1 in 
100% H2O and 100% D2O (Figure 2B). By Guinier 
analysis, the average radius of gyration (Rg) of the 
transposase was 50.3 ± 2.3 Å (Table 1). The molecular 
weight of the H-Mos1 transposase was calculated to be 
87.5 ± 7.5 kDa, from the scattering intensity extrapolated 
to zero angle (I(0)). This compares with a theoretical 
molecular mass of the transposase monomer of 
40.7 kDa. Thus the SAXS and SANS data confirm that 
Mos1 transposase is a dimer in solution, consistent with 
previous conclusions from gel-filtration analysis (11). 

Table 1. Structural parameters for Mos1 extracted from experimental SANS and SAXS data and the PEC crystal structure 

Method             Sample   % D2O   Conc (mg/ml)   I(0)    MW (kDa)   Rg (Å) Guinier   Rg (Å) Gnom   Dmax (Å)
SAXS               H-Mos1   0       1.8            13.5    79.9       49.2 ± 0.1       49.5 ± 0.2    185
SANS               H-Mos1   0       7.5            0.433   96.2       51.9 ± 0.7       53.4 ± 0.5    180
                   H-Mos1   100     S.8            0.913   89.5       51.2 ± 0.4       55.9 ± 0.2    180
SANS               D-Mos1   0       1.1            0.38    86.4 a     51.5 ± 0.7       55.1 ± 0.6    180
                   D-Mos1   30      1.1            0.15    -          51.1 ± 1.5       53.7 ± 0.9    180
                   D-Mos1   65      1.2            0.049   -          53.7 ± 2.1       53.0 ± 2.1    180
                   D-Mos1   100     1.2            0.01    -          -                -             -
Crystal Structure  Mos1     -       -              -       81.3       -                38 b          110

The coordinates for the Mos1 dimer in the PEC were extracted from the PDB file 3HOT. 
a Renormalized for mass of H-Mos1. 
b Calculated using CRYSOL. 



The SAXS and SANS data were transformed using 
GNOM into a distribution of paired distances, P(r), of 
all inter-atomic vectors in H-Mos1; the P(r) distribution 
calculated from the SAXS data is shown in Figure 2C. The 
maximum dimension (Dmax) of the transposase dimer 
was in the range 180-185 Å (Table 1). 



SANS of D-Mos1 transposase 

The use of deuterated transposase (D-Mos1) in neutron 
scattering extends the range of contrasts available for 
contrast variation experiments; proteins with ~75% 
deuterium incorporation are contrast matched in 100% 
(v/v) D2O. We prepared partially deuterated D-Mos1 as 
described in the methods. To determine experimentally the 
contrast match point of the sample, and to confirm its 
integrity, we measured neutron scattering curves of 
D-Mos1 in buffer containing 0, 30, 65 or 100% (v/v) 
D2O (Figure 2D). The structural parameters Rg and 
Dmax were consistent with those measured for H-Mos1 
(Table 1). The scattering amplitude at zero angle was 
calculated for each curve and plotted against the % D2O 
content of the buffer (Supplementary Figure S1). The 
linear best fit to these points intercepted the x-axis at 
92% (v/v) D2O, establishing the contrast match point 
for D-Mos1. 
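
The match-point determination can be sketched as a straight-line fit of the zero-angle amplitude against solvent D2O content. The version below is a rough illustration, not the analysis behind Supplementary Figure S1: it takes the D-Mos1 I(0) values and concentrations from Table 1, uses sqrt(I(0)/c) as the amplitude (an assumed normalization), and assigns a negative sign to the 100% D2O point on the assumption that it lies beyond the match point.

```python
import math


def linear_fit(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def match_point(points, negative_above):
    """x-intercept (% D2O) of the amplitude-versus-% D2O line.

    points: (percent_d2o, conc_mg_ml, i0) tuples.
    negative_above: amplitudes at % D2O above this value are given a
    negative sign, because the contrast changes sign at the match point.
    """
    xs, ys = [], []
    for d2o, conc, i0 in points:
        amplitude = math.sqrt(i0 / conc)
        xs.append(d2o)
        ys.append(-amplitude if d2o > negative_above else amplitude)
    slope, intercept = linear_fit(xs, ys)
    return -intercept / slope


# D-Mos1 rows of Table 1: % D2O, concentration (mg/ml), I(0).
d_mos1 = [(0, 1.1, 0.38), (30, 1.1, 0.15), (65, 1.2, 0.049), (100, 1.2, 0.01)]
estimate = match_point(d_mos1, negative_above=65)
print(round(estimate, 1))
```

With the rounded table values the intercept falls in the same region as the reported 92% (v/v) D2O; exact agreement is not expected, since the published fit may have used unrounded or differently normalized amplitudes.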

Comparison of the DNA-free and PEC Mos1 dimers 

We compared the experimentally measured structural 
parameters Rg and Dmax for the H-Mos1 homodimer 
in solution with those values calculated using the 
crystallographic atomic coordinates of the transposase 
dimer within the Mos1 PEC. Coordinates for all but the 
N-terminal 3 residues of the sequence (which are likely to 
be disordered) are defined in the crystal structure. The 
theoretical Rg for the compact and crossed arrangement 
of the PEC dimer was 38 Å (compared with the measured 
value of 50.3 Å), and the Dmax was significantly shorter at 
~110 Å (Table 1). A theoretical scattering curve for the 
transposase was calculated with CRYSOL using the 
crystallographic atomic coordinates (Supplementary 
Figure S2A), but the agreement between the theoretical, 
PEC model-based scattering curve and the experimentally 
measured SAXS data was poor (χ = 12.1). This indicates 
that the conformation of the transposase dimer when free 
in solution is significantly different to that when bound to 
two transposon DNA molecules in the PEC. 

Shape of the Mos1 Dimer 

Next we performed ab initio reconstruction of the shape of 
the transposase dimer in solution from the SAXS data. 
The protein was considered as a chain-like ensemble of 
dummy residues in the program GASBOR (37), with no 
symmetry restraints imposed. A gallery of 10 solutions is 
shown in Figure 3A. Calculations were also performed 
with P2 symmetry imposed and these gave similar results 
(Supplementary Figure S3). The molecular envelopes each 
have a markedly elongated, narrow shape, which contrasts 
with the more compact, crossed architecture of the 
transposase dimer in the Mos1 PEC (Figure 3B). 

Relative orientation of domains 

To establish how the monomers are arranged in the DNA- 
free transposase homodimer, we modelled the elongated 
quaternary structure by multi-subunit rigid body calcula- 
tions. The atomic coordinates of a transposase monomer 
from the PEC were used as the starting point. An 
exhaustive grid search of homodimer configurations, and 
their fit to the experimental SAXS data, was performed. 
To allow for potential variations in the orientations of 
domains about flexible linkers, the monomer structure 
was split into three separate domains: HTH1 contained 
residues Val 5 to Gly 56, HTH2 comprised residues Asp 
57 to Gly 117 and the catalytic domain spanned Arg 118 to 
Glu 345. Conditional distance restraints of 7 Å were defined 
between terminal residues of the three separated subunits 
adjacent in the protein sequence. 

Two possible models for the homodimer emerged from 
this analysis, each fitting reasonably to the SAXS curve. In 
the first tail-to-tail model, shown in Figure 3C, the two 
catalytic domains are in contact at the centre of the 
elongated homodimer, with the DNA-binding domains 
distal from each other at the peripheries. This model 
fitted to the SAXS curve with χ = 1.94 (Supplementary 
Figure S2B). In the second head-to-head model, the 
DNA-binding domains are in close proximity at the 
centre of the homodimer (Figure 3D), with the HTH1 
domains in contact. This model fitted to the SAXS data 
equally well, with χ = 1.98 (Supplementary Figure S2C). 
Head-to-tail models of the homodimer, in which the 
DNA-binding domain of one monomer contacts the 
catalytic domain of the second monomer (Figure 3E), 
had significantly lower quality fits to the experimental 
data, with χ = 21.0 (Supplementary Figure S2D). 

To distinguish between the tail-to-tail and head-to-head 
models, we created deletion mutants of Mos1 transposase 
(Figure 4A). We hypothesized that deletion of HTH1 
would have little effect on transposase dimerization in 
the tail-to-tail model of the elongated dimer (Figure 3C) 
but would disrupt dimerization if the DNA-binding 
domains were in contact, as in the head-to-head model 
(Figure 3D). The converse would be true for mutants in 
which the catalytic domain had been deleted. 

A deletion mutant of Mos1 transposase lacking the 
N-terminal 55 residues (delta-55 Mos1) was expressed 
and purified (Figure 4B). Analysis by SEC-MALLS established 
that delta-55 Mos1 eluted from the gel-filtration 
column after 30 min with an average Mw of 
33.7 kDa ± 2% (Figure 4C). This mass corresponds to a 
monomeric species. By comparison, the full-length 
H-Mos1 homodimer eluted at 28.2 min and had an 
average Mw of 78.9 kDa ± 2%, consistent with a dimer. 
Thus deletion of the N-terminal 55 residues (containing 
HTH1) results in loss of transposase dimerization. 

Next we performed the converse experiment. We 
expressed and purified the DNA-binding domain of 
Mos1, containing only the N-terminal 130 amino acids, 
as described previously (29) (Figure 4B). SEC-MALLS 
analysis showed that this deletion mutant eluted at 
29.9 min with a molecular mass of 32.7 kDa (Figure 4C) 
and thus formed a dimer, like the full-length transposase. 
A minor proportion of the sample (<1.5%) eluted after 
only 26.1 min and had a molecular mass of 68.3 kDa. 
A silver-stained SDS-PAGE of the eluted fractions 
confirmed the minor component is a tetramer of trans- 
posase (Supplementary Figure S4). Taken together these 
results are consistent with the head-to-head model of the 
Mos1 homodimer in which the DNA-binding domains are 
at the centre of the elongated molecule, with the HTH1 
domains in close contact as shown in Figure 3D. 
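
The oligomeric-state assignments above follow from dividing each SEC-MALLS mass by the corresponding monomer mass. A minimal sketch of that arithmetic, using the masses quoted in the text (the helper name and dictionary labels are illustrative):

```python
def oligomeric_state(measured_kda, monomer_kda):
    """Return (nearest whole number of subunits, raw mass ratio)."""
    ratio = measured_kda / monomer_kda
    return max(1, round(ratio)), ratio


# (measured Mw in kDa, calculated monomer mass in kDa), as quoted in the text.
samples = {
    "full-length H-Mos1": (78.9, 40.7),
    "delta-55 Mos1": (33.7, 34.34),
    "N-terminal 130, major peak": (32.7, 15.5),
    "N-terminal 130, minor peak": (68.3, 15.5),
}

states = {name: oligomeric_state(mw, mono)[0]
          for name, (mw, mono) in samples.items()}

for name, (mw, mono) in samples.items():
    n, ratio = oligomeric_state(mw, mono)
    print(f"{name}: ratio {ratio:.2f}, about {n} subunit(s)")
```

Rounding to the nearest integer is a simplification; a ratio far from a whole number would point to a mixture of species or an inaccurate mass rather than a well-defined oligomer.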

Stoichiometry of the pre-cleavage single-end complex 

Previously we proposed that Mos1 transposition is 
initiated by a transposase dimer binding to the inverted 

NSD 1.56 1.41 1.50 1.62 1.59 

χ fit 1.53 2.14 1.89 1.67 1.74 



Figure 3. Solution conformations of Mos1 transposase. (A) Gallery of ten spherical bead models of the H-Mos1 dimer calculated from the SAXS 
data in GASBOR, with no symmetry imposed. The χ of the fit to the experimental SAXS data and the NSD between models are indicated below each 
model. (B) Compact structure of the Mos1 dimer in the PEC crystal structure (from PDB ID: 3HOT); one Mos1 monomer is blue and the other 
orange. (C) An elongated tail-to-tail Mos1 dimer with a catalytic domain dimerization interface (C-C model). (D) Alternative elongated head-to-head 
model with a DNA-binding domain interface (N-N model). (E) The elongated head-to-tail (N-C) model fitted poorly to the scattering data. 



repeat sequence at one transposon end (14), forming a 
single-end complex (SEC2). This complex has been 
observed by EMSA under non-catalytic conditions: for 
example, when the Mg2+ ions required for PEC formation 
and DNA cleavage are excluded from the reaction, or if 
one of the catalytic Asp residues that coordinates Mg2+ is 
mutated to Ala [Supplementary Figure S6 in ref (14)]. We 
prepared a single-end complex (H-SEC2) by mixing 
H-Mos1 with a 50-mer DNA duplex containing the 
28 bp transposase inverted repeat recognition sequence 
(as described in the methods). SEC-MALLS analysis 
was performed to confirm the homogeneity and 
composition of H-SEC2. Samples of the DNA duplex 
and H-Mos1 were also analysed separately as controls. 
The SEC2 was loaded onto the gel-filtration column at 
an initial concentration of 7 µM, and eluted in a single 
peak with an average molecular mass, measured by 
MALLS, of 109.9 kDa ± 0.5% (Figure 5A). This is close 
to the expected mass of 112 kDa for a complex containing 
a dimer of transposase (81 kDa) and a single dsDNA 
molecule (31 kDa). By comparison, the predicted mass 
of SEC1 (transposase monomer bound to one DNA 
duplex) is 71.4 kDa and that of the PEC (dimer bound to two 
DNA molecules) is 143 kDa. 
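
The mass comparison can be written out explicitly. The sketch below uses the 40.7 kDa monomer mass and the 31 kDa duplex mass quoted in the text; because the text rounds the dimer to 81 kDa, the sums here differ from the quoted predictions by a few tenths of a kDa. The names are illustrative.

```python
MONOMER_KDA = 40.7  # theoretical transposase monomer mass quoted in the text
DNA_KDA = 31.0      # mass of the 50-mer duplex quoted in the text

# Candidate complexes as (transposase monomers, DNA duplexes).
CANDIDATES = {"SEC1": (1, 1), "SEC2": (2, 1), "PEC": (2, 2)}


def expected_mass(n_protein, n_dna):
    """Sum of component masses for a given stoichiometry, in kDa."""
    return n_protein * MONOMER_KDA + n_dna * DNA_KDA


def closest_complex(measured_kda):
    """Name of the candidate whose expected mass is nearest the measurement."""
    return min(CANDIDATES,
               key=lambda name: abs(expected_mass(*CANDIDATES[name]) - measured_kda))


best = closest_complex(109.9)  # SEC-MALLS mass of the eluted complex
for name, (n_p, n_d) in CANDIDATES.items():
    print(name, round(expected_mass(n_p, n_d), 1))
print("closest:", best)
```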

To confirm the content and stoichiometry of the 
complex, we analysed fractions of SEC2, eluted by gel- 
filtration chromatography, by SDS-PAGE silver-stained 
for both protein and DNA (Figure 5B). We quantified 
the number of pmoles of transposase and DNA seen in 
each lane containing SEC2, by comparing the band 



intensities with those of three internal protein controls 
or three DNA controls, respectively. The molar ratio of 
transposase to DNA in the complex was estimated to be 
1.8, compared with a ratio of 2 expected for SEC2. 
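
One way to carry out this kind of band quantification is a linear calibration against the internal standards, followed by conversion from mass to moles (1 ng divided by a mass in kDa gives pmol). The sketch below is an assumption about the procedure, not the authors' exact analysis, and every band intensity in it is hypothetical.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def calibrate(standards_ng, intensities):
    """Return a function mapping band intensity to nanograms."""
    slope, intercept = linear_fit(standards_ng, intensities)
    return lambda intensity: (intensity - intercept) / slope


def pmol(ng, mass_kda):
    """Convert nanograms to picomoles for a species of the given mass."""
    return ng / mass_kda


standards_ng = [150.0, 300.0, 450.0]  # standard loads named in the methods

# Hypothetical intensities, for illustration only.
protein_ng = calibrate(standards_ng, [1500.0, 3000.0, 4500.0])
dna_ng = calibrate(standards_ng, [3000.0, 6000.0, 9000.0])

protein_pmol = pmol(protein_ng(2442.0), 40.7)  # transposase monomer mass
dna_pmol = pmol(dna_ng(1860.0), 31.0)          # 50-mer duplex mass
molar_ratio = protein_pmol / dna_pmol
print(round(molar_ratio, 2))
```

Silver staining is only approximately linear in load, so a fit like this is best restricted to bands that fall inside the range covered by the standards.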

Further confirmation of the stoichiometry of SEC2 was 
obtained from the ratio of UV absorbance at 260 nm and 
280 nm during gel-filtration (Figure 5C). At the gel- 
filtration peak maximum this ratio was 1.50. We calculated 
the theoretical UV absorbance ratios for SEC1, SEC2 and 
PEC by summing the extinction coefficients of the Mos1 
and DNA components according to the stoichiometry of 
each complex (Supplementary Table S2). The theoretical 
absorbance ratios are 1.49 for SEC2 and 1.63 for both 
SEC1 and the PEC. Thus the experimentally measured 
ratio of 1.50 is consistent only with SEC2. 

Solution scattering of SEC2 and contrast variation 

To establish the solution conformation of the pre-cleavage 
complex, we measured the small angle scattering from five 
SEC2 samples, prepared with a 50-mer DNA duplex 
(containing the 28 bp IR DNA sequence) and either 
H-Mos1 or D-Mos1 (Table 2). A range of contrasts 
between the components of the complex and the buffer 
were used in the neutron scattering experiments to 
enable the structural parameters, shape and relative 
spatial arrangement of the protein and DNA to be 
extracted. The use of deuterated transposase extended 
the range of contrasts available and provided enhanced 
scattering from the transposase, despite the lower 
concentration available. 
Figure 4. Analysis of Mos1 transposase deletion mutants. (A) Domain arrangements of full-length Mos1, the HTH1 domain deletion mutant delta- 

The scattering data collected are shown in Figure 6A. In 
the SANS experiments, the relative contribution of each 
component of the complex to the total scattering intensity 
is dependent on the H2O:D2O ratio of the solvent in each 
sample, as well as the deuteration level of the protein 
(Supplementary Table S2). In particular, at 100% (v/v) 
D2O solvent, the transposase component of D-SEC2 is 
close to the contrast match point and the measured 
scattering is dominated by the DNA component. 
Conversely, at 65% (v/v) D2O solvent content, the DNA 
is contrast matched and the measured scattering intensity 
is primarily from the protein in the complex. The 
scattering intensities measured from H-SEC2 in buffer 
containing 0% (v/v) or 100% (v/v) D2O reflect the 
parameters and conformation of the whole complex. We 
also collected SAXS data on the H-SEC2 complex and in 
this case the contrast of the DNA relative to the solvent 
was twice that of the protein. 

SEC2 model-independent structural parameters 

We measured the molecular mass of H-SEC2 from the 
SANS experiment in 100% (v/v) H2O, and the value of 
109.2 kDa was in agreement with the mass measured by 
SEC-MALLS. Consistent with this, the average Rg of 
H-SEC2 was 60.4 ± 0.8 Å (an increase from 50.3 Å for 
the protein in the absence of DNA) and the average 
Dmax was 220 Å (Figure 6B), suggesting that the 
complex is ~35-40 Å longer than the DNA-free 
transposase dimer. From the scattering in 65% (v/v) 
D2O, where the DNA is masked, the Rg of the transposase 
within D-SEC2 was measured as 59.1 ± 3.5 Å. This is 
larger than the Rg of the DNA-free transposase dimer 
(53.7 ± 2.1 Å) and indicates that the protein conformation 
has opened up upon binding DNA and forming SEC2. 
This is also reflected by the larger Dmax of the transposase 
in D-SEC2 (Figure 6C): 190 Å compared with 180 Å for the 
DNA-free Mos1 dimer. From the scattering in 100% D2O, 
the Dmax of the DNA was 125 Å (Figure 6D). This 
compares with a Dmax of 140 Å and Rg of 40.2 Å 
for the DNA duplex alone (SAXS data not shown). Taken 
together these structural parameters indicate that the SEC2 
has an elongated shape and that the transposase dimer has 
a slightly more elongated conformation within the complex 
than in its DNA-free state. 

Next we used the Rg values for the complex and its 
components to establish the placement of the DNA 
relative to the transposase in H-SEC2. The distance 
between the centres of mass of the transposase and 
DNA can be related to the measured Rg values by the 
parallel axes theorem (38), given by the equation: 

$$R_g^2(\mathrm{SEC2}) = x\,R_g^2(\mathrm{Mos1}) + (1 - x)\,R_g^2(\mathrm{DNA}) + x(1 - x)\,L^2$$

where L is the distance in Å between the centres of mass of 
the transposase and the DNA, and x is the fraction of total 
scattering contributed from Mos1 transposase (38). The 
Mos1 dimer contributes 80% of the volume of H-SEC2. 
Because the protein has a contrast of 2.5 in 100% H2O 
and the contrast of the DNA is 4.5, the fraction (x) of the 
scattering from Mos1 is 0.69 (Supplementary Table S1). 
Using the measured Rg values of the complex and its 
protein and DNA components (58.1, 59.1 and 42.4 Å, 
respectively), the distance L is ~44 Å. Thus the centre of 
the DNA is displaced from the transposase centre of mass 
by a distance corresponding to approximately one fifth of 
the Dmax of the complex. This indicates that the DNA is 
bound predominantly to one half of the transposase dimer. 
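
The arithmetic of this paragraph can be checked with a short Python sketch. It rearranges the parallel axes relation, Rg²(SEC2) = x·Rg²(Mos1) + (1 − x)·Rg²(DNA) + x(1 − x)·L², to solve for L. The function names are ours, and the contrast-weighted volume formula used for x is an assumption: it reproduces the quoted value of 0.69 from the 80% volume fraction and the contrasts of 2.5 and 4.5, but the text does not state how x was derived.

```python
import math

def scattering_fraction(volume_fraction_a, contrast_a, contrast_b):
    """Fraction of scattering from component A in a two-component complex.

    ASSUMPTION: each component is weighted by volume fraction times contrast.
    """
    weight_a = volume_fraction_a * contrast_a
    weight_b = (1.0 - volume_fraction_a) * contrast_b
    return weight_a / (weight_a + weight_b)

def centre_of_mass_separation(rg_complex, rg_a, rg_b, x):
    """Solve Rg_complex^2 = x*Rg_a^2 + (1-x)*Rg_b^2 + x*(1-x)*L^2 for L."""
    l_squared = (rg_complex**2 - x * rg_a**2 - (1.0 - x) * rg_b**2) / (x * (1.0 - x))
    if l_squared < 0:
        raise ValueError("Rg values are inconsistent with a two-body model")
    return math.sqrt(l_squared)

# Values quoted in the text: Mos1 is 80% of the volume, contrasts 2.5 and 4.5.
x_mos1 = scattering_fraction(0.80, 2.5, 4.5)

# Rg of the complex, of the protein and of the DNA component, in angstroms.
separation = centre_of_mass_separation(58.1, 59.1, 42.4, x_mos1)

print(f"x = {x_mos1:.2f}, L = {separation:.1f} A")
print(f"L / Dmax = {separation / 220.0:.2f}")
```

With these inputs the separation comes out close to 44 Å, which is about one fifth of the 220 Å Dmax of the complex.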

Conformation of the pre-cleavage complex 

Next we used multi-phase analysis of the scattering curves, 
collected at the different phase contrasts, to calculate a 
low-resolution envelope for the whole complex in which 
the transposase and DNA components are also distin- 
guished. A gallery of the average bead models from 
eight separate simulations, performed using MONSA, is 
shown in Figure 6E. Models were aligned to each other 
using DAMAVER and the NSD is indicated below each 
model. The DNA phase has a long narrow shape 
consistent with the conformation of linear B-form DNA 
and is localized predominantly to one half of the complex. 
A B-form DNA duplex of 50 nucleotides can be 
superimposed on this phase (Figure 7). The transposase 
phase adopts an elongated conformation which is 
extended compared with the DNA-free Mos1 dimer, 
with a small dimerization interface. Two monomers of 
Mos1 can fit into the envelope (fitting performed by eye) 
so that the DNA is predominantly associated with one 
monomer, while the other monomer is presumed to be 
more flexible. 

DISCUSSION 

Cut-and-paste DNA transposition is orchestrated by 
transposase, which promotes the DNA cleavage and 
joining reactions. The initial steps of Mos1 transposition 
involve sequence-specific binding of transposase to the 
inverted repeat sequences at the transposon ends. Full 
excision of the transposon from the donor site occurs 
only after the two ends are brought together in a PEC 
containing a transposase dimer. To investigate how the 
PEC may form, we have established the stoichiometry 
and low resolution conformations of the early 
intermediates in the pathway: that is, the DNA-free full- 
length transposase and the pre-cleavage complex of 
transposase and a single transposon end (SEC2). 

Using solution scattering techniques, we have confirmed 
that the transposase is a homodimer in solution without 
DNA and established that it adopts an extended 
conformation. This arrangement differs markedly from 
the more compact, crossed architecture of the dimer 
when bound to two pre-cleaved DNA molecules in the 
PEC crystal structure. The elongation is reflected by the 
larger measured radius of gyration (50 Å) and maximum 
dimension (180 Å) of the transposase dimer in solution 
compared with values calculated for the protein 
component of the PEC (Rg = 38 Å and Dmax = 110 Å). 
These results suggest that the transposase dimer changes 
conformation prior to PEC formation, either upon 
binding to one transposon DNA end or during the 
synapsis of the two ends. 

To establish how the transposase domains are arranged 
within the elongated dimer, and which domains are 
essential for transposase dimerization in the absence of 
DNA, we prepared deletion mutants of Mos1 transposase. 
These lacked either the N-terminal 55 amino acids 
(containing HTH1) or the C-terminal catalytic domain. 
We found that the transposase was a monomer when the 
N-terminal 55 residues were deleted, whereas the mutant 
lacking the catalytic domain could still form dimers in 
solution. The elongated transposase dimer is therefore 
likely to be arranged with the DNA-binding domain at 
the central interface (as proposed in Figure 3D). An 
alternative model in which the catalytic domains interact 
and the DNA-binding domains extend to the dimer 
peripheries is not consistent with these results and is 
discounted. 

Figure 6. Small angle scattering and solution conformations of SEC2. (A) SAXS and SANS data of SEC2 complexes prepared with H-Mos1 and 
(E) Gallery of 8 bead models of SEC2 generated using MONSA, containing a DNA phase (green) and the protein phase (grey). Models were aligned 
in DAMAVER and the NSD for each model is indicated. 



Transposase dimerization via the N-terminal 55 
residues may be a molecular feature that is maintained 
throughout the Mos1 transposition pathway. In the PEC 
structure (14), this domain establishes a hydrophobic 
dimerization interface as well as binding sequence- 
specifically to the inner 8 bp of IR DNA, via the HTH1 
motif. The N-terminal 30 residues are essential for 
sequence-specific IR DNA binding (12,29), and deletion 
of the first 34 or 50 residues resulted in the loss of 
transposition activity in a plasmid-based transposition 
assay in E. coli (18). Weak transposition activity was 
regained, however, when the first 34 residues were 
replaced by either the dimerization domain of Gal4 or a 
leucine zipper dimerization domain (18). Taken together, 
these data suggest that sequence-specific interaction of the 
HTH1 motif with IR DNA is not vital for Mos1 
transposition, whereas transposase dimerization via the 
N-terminal residues is required. 

Figure 7. Schematic mechanism of early steps in Mos1 transposition. Transposase (blue and orange monomers) exists as an elongated dimer in 
DNA phase. The PEC could form by rotation of the flexible monomer about the HTH1 domain, accompanied by binding of the second transposon 
end. There may also be exchange of DNA between the two catalytic domains. Solution structures of the Mos1 dimer and SEC2, and the PEC crystal 
structure (3HOT) are shown below the schematic. 

Oligomerization prior to sequence-specific DNA 
binding has been observed for other eukaryotic DNA 
transposases, and may be a general feature regulating 
early steps in eukaryotic transposition. The P element 
transposase from Drosophila melanogaster exists as a 
pre-formed tetramer that binds to one of the two P 
element ends (9). The core RAG1 recombinase is either 
a dimer or trimer and associates with two RAG2 subunits 
to assemble on one recombination signal sequence, as a 
hetero-oligomer, before synapsis (39,40). The 
N-terminally truncated hAT transposase Hermes, from 
Musca domestica, exists as a hexamer both in solution 
and in crystals (41). In the mariner/Tc1 family, the 
N-terminal 57 amino acids of Sleeping Beauty transposase 
have been implicated in tetramerization of the DNA- 
binding domain in complex with transposase-binding 
sites (42), and the N-terminal region of Himar1 is 
involved in a protein-protein interaction interface (43). 
By contrast, the bacterial Tn5 transposase is most likely 
a monomer when not bound to DNA (44). It is thought 
that a conformational change in the free transposase 
monomer could allow for DNA binding and transposase 
dimerization (45) in the context of the synaptic complex 
containing two transposon ends (46). 

Mos1 transposition is initiated by sequence-specific 
binding of transposase to a transposon end. Here we 
used SEC-MALLS, small angle scattering, gel analysis 
and UV absorbance measurements to establish that the 
in vitro single-end complex contains a transposase dimer 
bound to one transposon DNA duplex. Using contrast 
variation neutron scattering experiments, we measured 
the structural parameters Rg and Dmax for the complex 
and its two constituent molecules. These revealed model- 
independent information about the shape, dimensions and 
relative spatial arrangement of the transposase and DNA 
in SEC2. We found that, like the transposase dimer, SEC2 
has an elongated architecture, but the transposase in the 
complex has become slightly more extended upon binding 
DNA. The dimensions of the DNA phase within SEC2 are 
consistent with a linear B-form DNA duplex, and the 51 Å 
separation between the centres of mass of the DNA and 
transposase suggests that the DNA is bound predomi- 
nantly to one half of the transposase dimer. 

The low resolution structural models of the SEC2 in 
solution illustrate the elongated shape and the relative 
spatial arrangement of the transposase and DNA within 
the complex. As predicted from analysis of the structural 
parameters, the DNA phase is seen to associate primarily 
with one half of the transposase dimer; the other half of 
the protein phase has fewer contacts with the DNA and is 
presumed to have more conformational freedom. The 
similarities in the architectures of the transposase in 
SEC2 and in solution suggest that binding of the pre- 
formed transposase dimer to the first transposon end 
occurs without major conformational changes in either 
the protein or the IR DNA. 

Comparison of the solution conformations of SEC2 and 
the PEC crystal structure suggests that a major 
conformational change in the transposase would occur 
during a transition from SEC2 to PEC. The data and 
models presented here provide a basis to propose a 
mechanism for this conformational change. We speculate 
that the transposase transitions from the elongated open 
dimer form when bound to one transposon end to the 
more closed compact arrangement seen in the PEC, by 
rotation of one monomer of transposase by ~180° with 
respect to the other. The N-terminal 55 residues of the 
DNA-binding domain may be the pivot for this 
rotation. As depicted in Figure 7, rotation of the free 
half of the transposase dimer about the DNA-binding 
domain could promote rearrangement of the complex. 
This rotation may be in concert with or triggered by 
binding of a second transposon DNA end, which would 
promote DNA looping. 

One controversial issue regarding the Mos1 
transposition mechanism remains outstanding: does 
nicking of the first DNA strand occur before or after 
synapsis of the Mos1 transposon ends? Dawson and 
Finnegan (10) reported that first strand cleavage was not 
dependent on prior PEC formation. Similarly, catalysis of 
Himar1 without synapsis of the ends was reported (16). 
Nicking of one strand prior to PEC formation could 
trigger a conformational change in SEC2, which then 
facilitates pairing of the two ends. This could provide 
one molecular explanation for metal ion dependence of 
PEC formation. 

The SEC2 to PEC model raises questions about how the 
crossed (or trans) arrangement of transposase and IR 
DNA observed in the PEC could arise. The modular 
nature of the transposase, with long flexible linkers 
between the DNA-binding and catalytic domains, could 
allow for sub-unit exchange from one IR DNA molecule 
to the other, during major conformational changes in the 
transposase. However, it remains to be established 
whether this exchange occurs before or after PEC 
formation or between cleavage events. 

SUPPLEMENTARY DATA 

Supplementary Data are available at NAR Online: 
Supplementary Tables 1-2 and Supplementary Figures 
1-4. 